Re-sampling of multi-class imbalanced data using belief function theory and ensemble learning
Abstract
Imbalanced classification refers to problems in which there are significantly more instances available for some classes than for others. Such scenarios require special attention because traditional classifiers tend to be biased towards the majority class, which has a large number of examples. Different strategies, such as re-sampling, have been suggested to improve imbalanced learning. Ensemble methods have also been proven to yield promising results in the presence of class imbalance. However, most of them only deal with binary datasets. In this paper, we propose a re-sampling approach based on belief function theory and ensemble learning for dealing with class imbalance in the multi-class setting. This technique assigns soft evidential labels to each instance. This modeling provides information about each object's region, which improves the selection of objects in both undersampling and oversampling. Our approach first selects ambiguous instances for undersampling, then oversamples the minority classes through the generation of synthetic examples in borderline regions to obtain better borders. Finally, to improve the induced results, the proposed approach is incorporated into a classifier-independent fusion-based ensemble. The comparative study against well-known methods reveals that our method is efficient according to the G-Mean and F1-score measures, independently of the chosen classifier.
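The two evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes the common multi-class definitions: G-Mean as the geometric mean of per-class recalls, and F1 averaged over classes with equal weight (macro averaging). The abstract does not state which averaging the authors use, so the macro choice and the function names `per_class_recall`, `g_mean`, and `macro_f1` are assumptions made for illustration only.

```python
import math


def per_class_recall(y_true, y_pred):
    """Return {label: recall} for every label that occurs in y_true."""
    recalls = {}
    for c in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls[c] = hits / total
    return recalls


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recalls (assumed multi-class G-Mean)."""
    recalls = per_class_recall(y_true, y_pred)
    return math.prod(recalls.values()) ** (1.0 / len(recalls))


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    scores = []
    for c in sorted(set(y_true)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # A class with no true or predicted instances contributes 0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels: class 0 is the majority, class 2 has a single instance.
y_true = [0, 0, 0, 0, 1, 1, 2]
missed = [0, 0, 0, 1, 1, 1, 0]   # the single class-2 instance is misclassified
found = [0, 0, 0, 1, 1, 1, 2]    # the single class-2 instance is recovered

print(g_mean(y_true, missed), macro_f1(y_true, missed))
print(g_mean(y_true, found), macro_f1(y_true, found))
```

Because G-Mean multiplies the recalls, a classifier that misses one minority class entirely scores zero regardless of how well it handles the majority class, which is why the measure suits imbalanced problems.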
Similar resources
Online Ensemble Learning for Imbalanced Data Streams
While both cost-sensitive learning and online learning have been studied extensively, the effort in simultaneously dealing with these two issues is limited. Aiming at this challenging task, a novel learning framework is proposed in this paper. The key idea is based on the fusion of online ensemble algorithms and the state-of-the-art batch mode cost-sensitive bagging/boosting algorithms. Within th...
A multi-class boosting method for learning from imbalanced data
The acquisition of face images is usually limited due to policy and economy considerations, and hence the number of training examples of each subject varies greatly. The problem of face recognition with imbalanced training data has drawn attention of researchers and it is desirable to understand in what circumstances imbalanced data set affects the learning outcomes, and robust methods are need...
Evaluating Difficulty of Multi-class Imbalanced Data
Multi-class imbalanced classification is more difficult than its binary counterpart. Besides typical data difficulty factors, one should also consider the complexity of relations among classes. This paper introduces a new method for examining the characteristics of multi-class data. It is based on analyzing the neighbourhood of the minority class examples and on additional information about sim...
A Pareto-based Ensemble with Feature and Instance Selection for Learning from Multi-Class Imbalanced Datasets
Imbalanced classification is related to those problems that have an uneven distribution among classes. In addition to the former, when instances are located into the overlapped areas, the correct modeling of the problem becomes harder. Current solutions for both issues are often focused on the binary case study, as multi-class datasets require an additional effort to be addressed. In this resea...
Ensemble-based hybrid probabilistic sampling for imbalanced data learning in lung nodule CAD
Classification plays a critical role in false positive reduction (FPR) in lung nodule computer-aided detection (CAD). The difficulty of FPR lies in the variation of the appearances of the nodules, and the imbalanced distribution between the nodule and non-nodule class. Moreover, the presence of inherent complex structures in data distribution, such as within-class imbalance and high-dimensionali...
Journal
Journal title: International Journal of Approximate Reasoning
Year: 2023
ISSN: 1873-4731, 0888-613X
DOI: https://doi.org/10.1016/j.ijar.2023.02.006